On 11th October we held our first JHOVE online hack day. Our aim was to catalogue error messages produced by JHOVE to get a better understanding of their meaning and potential preservation impact.
Background: organising an online hack day
We have been considering running online hackathons because attending face-to-face events has become more difficult as budgets have been squeezed. We also recognise it’s difficult for anyone outside of Europe to participate, despite holding our events in different European cities. This impression was reinforced during the iPRES panel session ‘Challenges and benefits of a collaboration of the Collaborators’; international cooperation is very important, but too many physical events take place in Europe, creating a barrier.
This was our first online, interactive hack event. To make it accessible to international contributors, we organised a number of check-in calls over 24 hours, starting at 09:00 NZDT on Tuesday 11th October (21:00 BST on Monday 10th October).
Goals and preparation
The topic was driven by our Document Interest Group (DIG), who have been documenting JHOVE PDF and some TIFF error messages on the OPF wiki on an ongoing basis. Understanding JHOVE’s error messages, and their implications, is an important issue for OPF members and JHOVE users. The DIG members all agreed to dedicate a day to progressing this work, while opening it up to anyone else in the community who wanted to take part.
Many thanks to David Russo and Peter May from The British Library, who helped prepare the error catalogues used on the Hack Day through both automation and hard work.
Tasks
We prepared a task list on GitHub to manage the activities. It was also available as a waffle board for those who prefer a kanban approach.
For each module, we created a spreadsheet template to catalogue the:
- error ID
- error message
- type of message
- explanation of the message
- example files
- source code location
- impact
- cure
The information was populated by participants during the day and will be used to generate JHOVE error wiki pages.
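As a rough illustration of how a catalogue row could become a wiki page, here is a minimal sketch in Python. It assumes the spreadsheet is exported as CSV with one column per field listed above. The column names, the `ErrorEntry` class, the page layout and the sample row are all hypothetical placeholders, not the actual template or real JHOVE messages.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class ErrorEntry:
    """One catalogue row. Field names are assumed, not the template's real headers."""
    error_id: str
    message: str
    message_type: str
    explanation: str
    example_files: str
    source_location: str
    impact: str
    cure: str


def load_entries(csv_text: str) -> List[ErrorEntry]:
    """Parse CSV text whose header row matches the ErrorEntry field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [ErrorEntry(**row) for row in reader]


def to_wiki_page(entry: ErrorEntry) -> str:
    """Render one entry as plain text, one labelled line per catalogue field."""
    lines = [
        f"{entry.error_id}: {entry.message}",
        f"Type: {entry.message_type}",
        f"Explanation: {entry.explanation}",
        f"Example files: {entry.example_files}",
        f"Source code location: {entry.source_location}",
        f"Impact: {entry.impact}",
        f"Cure: {entry.cure}",
    ]
    return "\n".join(lines)


# Made-up sample row, for illustration only.
SAMPLE_CSV = (
    "error_id,message,message_type,explanation,example_files,"
    "source_location,impact,cure\n"
    "EXAMPLE-1,Example message text,Error,Placeholder explanation,"
    "example.pdf,ExampleModule.java,Placeholder impact,Placeholder cure\n"
)

if __name__ == "__main__":
    for entry in load_entries(SAMPLE_CSV):
        print(to_wiki_page(entry))
```

Keeping each row as a small typed record means a missing or misnamed column fails loudly when the file is loaded, rather than producing a wiki page with silent gaps.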
We also created other technical and non-technical tasks including refactoring error messages, reviewing the JHOVE website and the new JHOVE Guide for Beginners.
Participants
A big thanks to our participants:
- Peter May, British Library
- David Russo, British Library
- Yvonne Tunnat, Deutsche Zentralbibliothek für Wirtschaftswissenschaften
- Kati Sein, Rahvusarhiiv
- Jonas Frellesen, Kongelige Bibliotek
- Thomas Ledoux, Bibliothèque nationale de France
- Jody Palmer, Preservica
- Ross Spencer, Archives New Zealand
- Andrea Byrne, Archives New Zealand
- Isabel Meyer, Smithsonian Institution
- Tyler Thorsted, Church History Library
- Jay Gattuso and Sean at the National Library of New Zealand, who not only contributed during 11th October in their local time zone, but carried on to work through 11th October in Europe!
What we achieved
Together, we completed or made progress on a large number of the tasks:
- PDF module errors extracted and logged
- JPEG module errors extracted and logged
- TIFF module errors extracted and logged
- WAVE module errors extracted and logged
- AIFF module errors extracted and logged
- UTF8 module errors extracted and logged
- XML errors extracted and logged
- HTML module errors extracted and logged
- PNG module errors extracted and logged
- Reviewed the OPF Black Box Tester
- Improvements to README, project POMs and RELEASENOTES
- Documented PDF reference information to add to the website
- Began a review of the JHOVE website, and made some updates
- Reviewed the new JHOVE Guide for Beginners to add to the website
- TEST FILES 🙂 We got plenty of files that demonstrate particular JHOVE errors. These are really important for regression testing, and we need more!
Next steps
We had so much input it will take a little time and effort to review. Initially we’ll be:
- Checking the test files submitted to ensure they demonstrate the error they’re supposed to
- Adding checked test files to the automated JHOVE regression tests
- Reviewing pull requests for inclusion in the next JHOVE release
- Making some improvements to the JHOVE website based on review feedback
- Using the spreadsheet information to generate a new JHOVE wiki that provides contextual information for JHOVE error messages
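The first check in the list above could be sketched roughly as follows. This is only an illustration: it assumes the messages JHOVE reports for each file have already been collected into a dictionary by some other means, and the function name, file names and message texts are hypothetical.

```python
from typing import Dict, List


def find_mismatches(
    expected: Dict[str, str],
    reported: Dict[str, List[str]],
) -> List[str]:
    """Return the files whose reported messages lack the expected one.

    expected maps a file name to the message it was submitted to demonstrate.
    reported maps a file name to the messages actually produced for it.
    """
    mismatches = []
    for file_name, expected_message in expected.items():
        messages = reported.get(file_name, [])
        # A file passes if any reported message contains the expected text.
        if not any(expected_message in message for message in messages):
            mismatches.append(file_name)
    return mismatches


if __name__ == "__main__":
    # Placeholder data, not real JHOVE output.
    expected = {"a.pdf": "Example message A", "b.pdf": "Example message B"}
    reported = {
        "a.pdf": ["Example message A"],
        "b.pdf": ["Some other message"],
    }
    print(find_mismatches(expected, reported))
```

Files returned by a check like this would need a second look before being added to the regression tests.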
The next release of JHOVE will be in November. It will include tested contributions from the JHOVE Hack Day, as well as issues and bug fixes selected by the JHOVE Product Board.
Follow-on event
The day was a great success. We’ve already had requests to run another JHOVE Hack Day, so we are provisionally planning a follow-on online event in the first quarter of 2017. We’ve asked our participants for feedback on their experience of the day so we can make improvements for next time. We will also investigate other topics for our online hack days.
In the meantime, if anyone has any further test files that demonstrate JHOVE module validation errors, please get in touch.
Thank you!
Once again – a huge THANK YOU to everyone who participated and supported the event. We were overwhelmed with the energy and enthusiasm of the participants and the progress we made in such a short time. A truly fantastic international community effort!
Become a JHOVE Software Supporter
If your organisation uses JHOVE, please consider becoming a Software Supporter to help enable regular maintenance and development of the software. JHOVE Software Supporters guide the roadmap and can receive training and technical support.